PERFORMANCE COMPARISONS FOR TWO VERSIONS OF THE
STANIFORTH-MITCHELL BAROTROPIC
NUMERICAL WEATHER PREDICTION CODE

Introduction

Reference 1 proposed improvements to the Staniforth-Mitchell
(Ref. 2) barotropic finite element code. It was anticipated that
these improvements would yield significant reductions in
computation time. Reported here are comparisons of calculated
results and of the times required to perform key sets of
calculations using the "original" code and an "amended" code
that incorporates the proposed changes. The test problem uses a
periodic East-West boundary condition, walls on the North and
South boundaries, and a 12x13 grid. Timing data were obtained
using an IBM 4381-M1 processor operating in batch mode. (The
system utility routines SETIME and GETIME were found not to
give useful results in time-sharing mode.)

Timing Results

Eigenproblem. The first comparison is for the determination of
eigenvalues and eigenvectors for the x-direction "stiffness" and
"mass" matrices. In the original program the relevant routines
are MTRXC and EIGEN2, both called in EBVSET. In the amended
version the routines are SETABX, SETD2N, and PEREIG, all called
by SETUP. Within PEREIG there are 3 calls to EIGVCP and 2 calls
to RAYLYP. A single solution takes 104 milliseconds in the
original version and 34 milliseconds in the amended version.
Because the eigenproblem is solved 4 times in the original
program and only 3 times in the amended version, the overall
comparison is between 415 milliseconds and 101 milliseconds.


Time Step. For each time step the Helmholtz equation is solved
once. In the original version an iterative scheme devised by
Concus and Golub is used, whereas the amended version uses a
direct solution. When only a single iteration is employed in the
original version, the difference in solution times is entirely a
result of the extensive rearrangement of the right-hand side of
the Helmholtz equation that the Concus and Golub scheme
requires. This rearrangement is performed in SOLHEL. The most
useful comparison is believed to be between the times required
for a single execution of the routine TSTEP. For the original
version this time is 372 milliseconds and for the amended version
it is 331 milliseconds, a saving of 11%.
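
The timing figures above can be cross-checked with a few lines of
arithmetic. The sketch below is illustrative only: the variable and
function names are not taken from either program, and it assumes that
the overall eigenproblem time is simply the time per solution
multiplied by the number of solutions.

```python
# Timings in milliseconds, as reported above.
EIGEN_ORIGINAL_MS, EIGEN_AMENDED_MS = 104, 34     # one eigenproblem solution
EIGEN_ORIGINAL_SOLVES, EIGEN_AMENDED_SOLVES = 4, 3
TOTAL_ORIGINAL_MS, TOTAL_AMENDED_MS = 415, 101    # reported overall times
TSTEP_ORIGINAL_MS, TSTEP_AMENDED_MS = 372, 331    # one execution of TSTEP


def percent_saving(original, amended):
    """Reduction in time, as a percentage of the original time."""
    return 100.0 * (original - amended) / original


# Assumption: overall time = time per solution x number of solutions.
est_original = EIGEN_ORIGINAL_MS * EIGEN_ORIGINAL_SOLVES
est_amended = EIGEN_AMENDED_MS * EIGEN_AMENDED_SOLVES

eigen_speedup = TOTAL_ORIGINAL_MS / TOTAL_AMENDED_MS
tstep_saving = percent_saving(TSTEP_ORIGINAL_MS, TSTEP_AMENDED_MS)

print(f"estimated eigenproblem totals: {est_original} ms, {est_amended} ms")
print(f"eigenproblem speedup (reported totals): {eigen_speedup:.1f}x")
print(f"TSTEP saving: {tstep_saving:.0f}%")
```

The estimated totals each differ from the reported totals by 1
millisecond, which would be consistent with rounding of the
per-solution times.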


Remarks 

In producing a version of the barotropic code incorporating
the changes proposed in Ref. 1, it was discovered that the
calculated results diverged from those of the original version.
It was not immediately obvious which results were better. The
likely cause seemed to be differences between the two solutions
to the eigenproblem. A comparison was made by constructing, for
each version, the product of the back-transformation matrix with
the forward transformation matrix. Since this product should
yield the identity matrix, measures of the quality of the
transforms were obtained by finding the standard deviation (from
unity) of the elements of the principal diagonal and the standard
deviation (from zero) of the off-diagonal elements. For the
diagonal elements the value for the original version was 3.5E-06
and for the amended version it was 5E-06. For the off-diagonal
elements the disparity was greater: 5E-07 for the original and
25E-07 for the amended version.
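
The quality measure described above can be sketched as follows. This
is a minimal illustration, not the code used in the study: the
function names and the 2x2 test matrices are hypothetical, and
"standard deviation (from unity)" is interpreted here as the
root-mean-square deviation from 1 (and from 0 for the off-diagonal
elements).

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transform_quality(back, forward):
    """RMS deviation of back*forward from the identity matrix.

    Returns (diagonal deviation from 1, off-diagonal deviation from 0).
    """
    p = matmul(back, forward)
    n = len(p)
    diag = [(p[i][i] - 1.0) ** 2 for i in range(n)]
    off = [p[i][j] ** 2 for i in range(n) for j in range(n) if i != j]
    return math.sqrt(sum(diag) / len(diag)), math.sqrt(sum(off) / len(off))


# Hypothetical 2x2 example: a rotation and its transpose (its exact inverse),
# then the same pair with one element of the back-transform perturbed.
c, s = math.cos(0.3), math.sin(0.3)
forward = [[c, -s], [s, c]]
back_exact = [[c, s], [-s, c]]
back_perturbed = [[c + 1.0e-6, s], [-s, c]]

print(transform_quality(back_exact, forward))
print(transform_quality(back_perturbed, forward))
```

For the exact pair both measures are at the level of rounding error;
for the perturbed pair both are of the order of the perturbation.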


Since the eigensolution of the amended version is iteratively
improved by successive calls to RAYLYP and EIGVCP, efforts
were made to improve results by additional calls to this pair of
subroutines. Although this led to the addition of one cycle
beyond the single cycle recommended in Ref. 1, no further
improvement seems to be possible using single-precision arithmetic.

In order to resolve doubts concerning the eigensolution
scheme of the amended version, a separate double-precision
program was written to compare the eigenvectors and eigenvalues
with exact results. Using 3 cycles of improvement, both
eigenvectors and eigenvalues were accurate to 15 decimal digits.
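
The single-precision limit noted above can be put in context with a
rough estimate of floating-point spacing. The sketch below assumes
that the programs used the System/370 hexadecimal short format (6
hexadecimal fraction digits) and long format (14 hexadecimal fraction
digits); this report does not state the format, and the function
name is illustrative.

```python
import math


def hex_float_spacing(fraction_hex_digits):
    """Best- and worst-case relative spacing of a normalized base-16 float.

    The fraction lies in [1/16, 1), so one unit in the last hexadecimal
    digit is relatively largest when the fraction is close to 1/16.
    """
    best = 16.0 ** (-fraction_hex_digits)
    worst = 16.0 ** (1 - fraction_hex_digits)
    return best, worst


for name, digits in (("short (single)", 6), ("long (double)", 14)):
    best, worst = hex_float_spacing(digits)
    # Decimal digits guaranteed in the worst case.
    decimal_digits = math.floor(-math.log10(worst))
    print(f"{name}: spacing {best:.1e} to {worst:.1e}, "
          f"about {decimal_digits} decimal digits")
```

Under this assumption, deviations of a few parts in a million, as
found in the identity test, amount to only a few rounding units in
the short format, while the long format supports roughly the 15
digits obtained with the double-precision program.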

Conclusions

In Ref. 1 attention was directed to the marginal accuracy
of single-precision arithmetic for this program. The present
study appears to provide further evidence in support of
this observation. Although the amended version shows some
reduction in computation time, its adoption cannot be recommended
without first verifying, by use of double-precision arithmetic,
that it does indeed give results compatible with those of the
original version.

While the present study was proceeding, a new possibility
for major improvement in computational efficiency came to
attention. In Ref. 3 Temperton and Staniforth describe a new time
integration algorithm which they call semi-Lagrangian,
semi-implicit. They show that it allows much longer time steps
than the present scheme while still giving equal accuracy.
Serious consideration should be given to incorporating the new
algorithm in the barotropic program.